Skip to content
This repository has been archived by the owner on Jan 13, 2025. It is now read-only.

disk index: store ref_count in data file #30974

Merged

Conversation

jeffwashington
Copy link
Contributor

@jeffwashington jeffwashington commented Mar 29, 2023

Problem

See #30711

image

Top line is master now.
On the right graph, the drop in the blue line is the index file shrinking thanks to index entry no longer containing u64 refcount.
The left graph is total data file size. The very bottom line is the change that stores slotlist=1 elements in the index file (4MB). The line just above it is THIS change, which adds a u64 per element so we can store ref_count. Not sure why we have a handful of data files. 6MB total allocated now when it used to be 4MB.
Note that both scales are log.
The very bottom line on the right graph is a test validator that increases the search length, thus, reducing the # of over-allocations we attempt.
These compound, so with that change, we’d be in the mid 20GB for the entire index basically.
Beats 74GB or whatever the previous impl was at.

Summary of Changes

When ref_count != 1, store ref_count per element in the data file. This means the 99.9% case of ref_count=1 does not have to be stored per allocated element.

Fixes #

@codecov
Copy link

codecov bot commented Mar 29, 2023

Codecov Report

Merging #30974 (c23503f) into master (9cbaf0e) will decrease coverage by 0.1%.
The diff coverage is 94.1%.

@@            Coverage Diff            @@
##           master   #30974     +/-   ##
=========================================
- Coverage    81.5%    81.4%   -0.1%     
=========================================
  Files         728      728             
  Lines      205758   205776     +18     
=========================================
- Hits       167746   167701     -45     
- Misses      38012    38075     +63     

@jeffwashington jeffwashington force-pushed the mm38_refcount_in_data branch 6 times, most recently from 450ce49 to fc53564 Compare April 4, 2023 15:23
@jeffwashington jeffwashington marked this pull request as ready for review April 4, 2023 15:25
@@ -412,7 +432,7 @@ mod tests {
#[test]
fn test_size() {
assert_eq!(std::mem::size_of::<PackedStorage>(), 1 + 7);
assert_eq!(std::mem::size_of::<IndexEntry<u64>>(), 32 + 8 + 8 + 8);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

look at the magic disappearance of that 8!

brooksprumo
brooksprumo previously approved these changes Apr 4, 2023
Copy link
Contributor

@brooksprumo brooksprumo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@jeffwashington
Copy link
Contributor Author

apologies again - more merge conflicts

@jeffwashington jeffwashington merged commit 9fb22bc into solana-labs:master Apr 5, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants